Chinese grammatical error correction model based on bidirectional and auto-regressive transformers noiser
Qiujie SUN, Jinggui LIANG, Si LI
Journal of Computer Applications    2022, 42 (3): 860-866.   DOI: 10.11772/j.issn.1001-9081.2021030441

Methods based on neural machine translation are widely used in Chinese grammatical error correction. These methods require a large amount of annotated data to guarantee performance, and such data is difficult to obtain for Chinese grammatical error correction. To address the issue that the limited size of annotated data constrains the performance of Chinese grammatical error correction systems, a Chinese Grammatical Error Correction Model based on a Bidirectional and Auto-Regressive Transformers (BART) Noiser (BN-CGECM) was proposed. Firstly, to speed up model convergence, a Chinese pretrained language model based on BERT (Bidirectional Encoder Representations from Transformers) was used to initialize the parameters of BN-CGECM's encoder. Secondly, a BART noiser was used to introduce text noise into the input samples during training, automatically generating diverse noisy data and thereby alleviating the problem of limited annotated data. Experimental results on the NLPCC 2018 dataset demonstrate that the proposed model achieves an F0.5 score 7.14 percentage points higher than that of the Chinese grammatical error correction system proposed by YouDao, and 6.48 percentage points higher than that of the Chinese grammatical error correction ensemble system proposed by Beijing Language and Culture University (BLCU_ensemble). Meanwhile, the proposed model enhances the diversity of the original data and converges faster without increasing the amount of training data.
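The augmentation idea is that BART-style noising functions (such as token masking and token deletion) corrupt each training input on the fly, so the model sees a different noisy variant of the same annotated sentence at each epoch. The sketch below illustrates this kind of noising; it assumes character-level tokenization of Chinese text, and the function name, mask symbol, and corruption rates are illustrative assumptions, not the paper's actual configuration.

import random

MASK = "[MASK]"

def bart_noise(tokens, mask_prob=0.15, delete_prob=0.05):
    # Apply two of BART's noising functions: token deletion and
    # token masking (rates here are hypothetical examples).
    noised = []
    for tok in tokens:
        r = random.random()
        if r < delete_prob:
            continue                     # token deletion: drop the token
        if r < delete_prob + mask_prob:
            noised.append(MASK)          # token masking: replace with [MASK]
            continue
        noised.append(tok)               # keep the token unchanged
    return noised

# Example: each call yields a different noisy variant of one sample.
sentence = list("他昨天去图书馆借书")      # character-level tokens
print("".join(bart_noise(sentence)))

Because the corruption is resampled every time a sample is drawn, the effective diversity of the training data grows without adding any new annotated sentences, which matches the abstract's claim of faster convergence with no extra training data.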
